Runtime Address Space Computation for SDSM Systems
نویسندگان
چکیده
This paper explores the benefits and limitations of using a inspector/executor approach for Software Distributed Shared Memory (SDSM) systems. The role of the inspector is to obtain a description of the address space accessed during the execution of parallel loops. The information collected by the inspector will enable the runtime to optimize the movement of shared data that will happen during the executor phase. This paper addresses the main issues that have been considered to embed an inspector/executor model in a SDSM system: amount of data collected by the inspector, the accurateness of this data when the loop has data and/or control dependences, and the computational overhead introduced. The paper also includes a description of the SDSM system where the inspector/executor model has been embedded. The proposal is evaluated with four applications from the NAS benchmark suite. The evaluation shows that the accuracy of the inspection and the small overheads introduced by the approach allow its use in a SDSM system.
منابع مشابه
OpenMP compiler for a Software Distributed Shared Memory System SCASH
In this paper, we present an implementation of OpenMP compiler for a page-based software distributed shared memory system, SCASH on a cluster of PCs. For programming distributed memory multiprocessors such as clusters of PC/WS and MPP, message passing is usually used. A message passing system requires programmers to explicitly code the communication and makes writing parallel programs cumbersom...
متن کاملRaptor: Integrating Checkpoints and Thread Migration for Cluster Management
Software distributed shared-memory (SDSM) provides the abstraction necessary to run shared-memory applications on cost-effective parallel platforms such as clusters of workstations. However, problems such as cluster component reliability and cluster management, which are not directly related to performance, need to be addressed before SDSM solutions can be widely adopted. This paper presents Ra...
متن کاملCoherence-based Coordinated Checkpointing for Software Distributed Shared Memory Systems
Fault-tolerant techniques that can cope with system failures in software distributed shared memory (SDSM) are essential for creating productive and highly available parallel computing environments on clusters of workstations. In this paper, we propose a new, efficient coordinated checkpointing technique, called coherence-based coordinated checkpointing (CCC), for SDSM. Our CCC minimizes both th...
متن کاملXpressSpace: a programming framework for coupling partitioned global address space simulation codes
Complex coupled multiphysics simulations are playing increasingly important roles in scientific and engineering applications such as fusion, combustion, and climate modeling. At the same time, extreme scales, increased levels of concurrency, and the advent of multicores are making programming of high-end parallel computing systems on which these simulations run challenging. Although partitioned...
متن کاملThe Nexus Task-parallel Runtime System the Nexus Task-parallel Runtime System
A runtime system provides a parallel language compiler with an interface to the low-level facilities required to support interaction between concurrently executing program components. Nexus is a portable runtime system for task-parallel programming languages. Distinguishing features of Nexus include its support for multiple threads of control, dynamic processor acquisition, dynamic address spac...
متن کامل